8 research outputs found

    Methods for efficient resource utilization in statistical machine learning algorithms

    In recent years, statistical machine learning has emerged as a key technique for tackling problems that elude a classic algorithmic approach. One such problem, with a major impact on human life, is the analysis of complex biomedical data. Solving this problem in a fast and efficient manner is of major importance, as it enables, e.g., the prediction of the efficacy of different drugs for therapy selection. While achieving the highest possible prediction quality appears desirable, doing so is often simply infeasible due to resource constraints. Statistical learning algorithms for predicting the health status of a patient or for finding the best algorithm configuration for the prediction require an excessively high amount of resources. Furthermore, these algorithms are often implemented with no awareness of the underlying system architecture, which leads to sub-optimal resource utilization. This thesis presents methods for efficient resource utilization of statistical learning applications. The goal is to reduce the resource demands of these algorithms to meet a given time budget while simultaneously preserving the prediction quality. As a first step, the resource consumption characteristics of learning algorithms are analyzed, as well as their scheduling on underlying parallel architectures, in order to develop optimizations that enable these algorithms to scale to larger problem sizes. For this purpose, new profiling mechanisms are incorporated into a holistic profiling framework. The results show that one major contributor to the resource issues is memory consumption. To overcome this obstacle, a new optimization based on dynamic sharing of memory is developed that speeds up computation by several orders of magnitude in situations where available main memory is the bottleneck and memory is swapped out. One important application, which can be used for automated parameter tuning of learning algorithms, is model-based optimization. Within a huge search space, algorithm configurations are evaluated to find the configuration with the best prediction quality. An important step towards better managing this search space is to parallelize the search process itself. However, a high runtime variance within the configuration space can cause inefficient resource utilization. To address this, new resource-aware scheduling strategies are developed that efficiently map evaluations of configurations to the parallel architecture, depending on their resource demands. In contrast to classical scheduling problems, the new scheduling interacts with the configuration proposal mechanism to select configurations with suitable resource demands. With these strategies, it becomes possible to make use of the full potential of parallel architectures. Compared to established parallel execution models, the results show that the new approach enables model-based optimization to converge faster to the optimum within a given time budget.
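
The idea of mapping configuration evaluations to parallel workers according to their resource demands can be illustrated with a small sketch. This is not the scheduling strategy developed in the thesis, which the abstract does not specify; it is a generic longest-processing-time-first heuristic. The function name `schedule_lpt`, the runtime estimates, and the way the time budget is enforced are all assumptions made for illustration.

```python
import heapq


def schedule_lpt(estimated_runtimes, n_workers, time_budget):
    """Greedy longest-processing-time-first assignment (illustrative only).

    estimated_runtimes: dict mapping a configuration id to its estimated
        runtime (hypothetical values, e.g. from a runtime model).
    Returns (assignment, skipped): jobs per worker, and the jobs that do
    not fit into the time budget on any worker.
    """
    # Min-heap of (current load, worker index): the least-loaded worker is on top.
    loads = [(0.0, w) for w in range(n_workers)]
    heapq.heapify(loads)
    assignment = {w: [] for w in range(n_workers)}
    skipped = []

    # Place the longest jobs first so short jobs can fill the remaining gaps.
    by_runtime = sorted(estimated_runtimes.items(), key=lambda kv: kv[1], reverse=True)
    for job, runtime in by_runtime:
        load, worker = heapq.heappop(loads)
        if load + runtime <= time_budget:
            assignment[worker].append(job)
            load += runtime
        else:
            # If the least-loaded worker cannot fit the job, no worker can.
            skipped.append(job)
        heapq.heappush(loads, (load, worker))
    return assignment, skipped


if __name__ == "__main__":
    runtimes = {"a": 5, "b": 4, "c": 3, "d": 3, "e": 9}
    print(schedule_lpt(runtimes, n_workers=2, time_budget=12))
```

Sorting by estimated runtime before assignment is what limits idle time when runtimes vary strongly between configurations. A scheduler that also interacts with the configuration proposal mechanism, as the abstract describes, would need more than this static assignment.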

    A Real-Time Thermal Sensor System for Quantifying the Inhibitory Effect of Antimicrobial Peptides on Bacterial Adhesion and Biofilm Formation

    The increasing rate of antimicrobial resistance (AMR) in pathogenic bacteria is a global threat to human and veterinary medicine. Beyond antibiotics, antimicrobial peptides (AMPs) might be an alternative to inhibit the growth of bacteria, including AMR pathogens, on different surfaces. Biofilm formation, which starts out as bacterial adhesion, poses additional challenges for antibiotics targeting bacterial cells. The objective of this study was to establish a real-time method for monitoring the inhibition of (a) bacterial adhesion to a defined substrate and (b) biofilm formation by AMPs using an innovative thermal sensor. We provide evidence that the thermal sensor enables continuous monitoring of the effect of two potent AMPs, protamine and OH-CATH-30, on surface colonization of bovine mastitis-associated Escherichia (E.) coli and Staphylococcus (S.) aureus. The bacteria were grown under static conditions on the surface of the sensor membrane, on which temperature oscillations generated by a heater structure were detected by an amorphous germanium thermistor. Bacterial adhesion, which was confirmed by white light interferometry, caused a detectable amplitude change and phase shift. To our knowledge, the thermal measurement system has never before been used to assess the effect of AMPs on bacterial adhesion in real time. The system could be used to screen and evaluate bacterial adhesion inhibition of both known and novel AMPs.

    Performance Analysis for Parallel R Programs: Towards Efficient Resource Utilization

    Parallel computing is becoming more and more popular, since R is increasingly used to process large data sets. We have therefore improved traceR to also allow for profiling parallel applications. traceR can be used for common cases like parallelization on multiple cores or parallelization on multiple machines. For the parallel performance analysis, we added measurements such as the CPU utilization of parallel tasks and measurements for analyzing the memory usage of parallel programs during execution. With our parallel performance analysis, we concentrate on applications that are embarrassingly parallel, consisting of independent tasks. One example application that is embarrassingly parallel and also has a high resource utilization is model selection. Here the goal is to find the best machine learning algorithm configuration for building a model for the given data, which requires searching through a huge model space. Since the gain from parallel execution can be negated if the memory requirements of all parallel processes exceed the capacity of the system, our profiling data can serve as a constraint to determine the degree of parallelism and also to guide the distribution of parallel R applications. Our goal is to provide a resource-aware parallelization strategy. To develop such a strategy, we first need to analyze the performance of parallel applications. In the following, we therefore describe different parallel example applications and show how traceR is applied to analyze parallel R applications.
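
The use of profiled memory requirements as a constraint on the degree of parallelism can be sketched in a few lines. This is an illustrative calculation only, not part of traceR; the function name `max_parallel_tasks`, its parameters, and the numbers in the usage example are assumptions. It is written in Python rather than R purely for illustration.

```python
def max_parallel_tasks(n_cores, total_memory_mb, peak_task_memory_mb, reserve_mb=0):
    """Bound the number of concurrent tasks by both cores and memory.

    peak_task_memory_mb is assumed to come from profiling a single task;
    reserve_mb is memory kept free for the operating system and the
    master process (hypothetical parameter).
    """
    if peak_task_memory_mb <= 0:
        raise ValueError("peak_task_memory_mb must be positive")
    usable_mb = max(total_memory_mb - reserve_mb, 0)
    # Number of tasks whose combined peak memory still fits into usable memory.
    by_memory = int(usable_mb // peak_task_memory_mb)
    # Never use more tasks than cores; always allow at least one task to run.
    return max(1, min(n_cores, by_memory))


if __name__ == "__main__":
    # 16 cores, 32 GB RAM, 6 GB peak per task, 2 GB reserved.
    print(max_parallel_tasks(16, 32000, 6000, reserve_mb=2000))
```

In this example the memory bound, not the core count, determines the degree of parallelism. That is the situation the abstract describes, where starting one process per core would exceed the capacity of the system.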